anaphora resolution


Evaluating Prompt-Based and Fine-Tuned Approaches to Czech Anaphora Resolution

Stano, Patrik, Horák, Aleš

arXiv.org Artificial Intelligence

Anaphora resolution plays a critical role in natural language understanding, especially in morphologically rich languages like Czech. This paper presents a comparative evaluation of two modern approaches to anaphora resolution on Czech text: prompt engineering with large language models (LLMs) and fine-tuning compact generative models. Using a dataset derived from the Prague Dependency Treebank, we evaluate several instruction-tuned LLMs, including Mistral Large 2 and Llama 3, using a series of prompt templates. We compare them against fine-tuned variants of the mT5 and Mistral models that we trained specifically for Czech anaphora resolution. Our experiments demonstrate that while prompting yields promising few-shot results (up to 74.5% accuracy), the fine-tuned models, particularly mT5-large, outperform them significantly, achieving up to 88% accuracy while requiring fewer computational resources. We analyze performance across different anaphora types, antecedent distances, and source corpora, highlighting key strengths and trade-offs of each approach.
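The few-shot prompting setup described above can be sketched as a simple template builder (a minimal illustration with hypothetical field names and toy Czech examples; the paper's actual prompt templates are not reproduced here):

```python
# Build a few-shot prompt asking an LLM to name a pronoun's antecedent.
# The Text/Pronoun/Antecedent field names are illustrative, not the paper's.

def build_prompt(context, anaphor, examples):
    """Assemble a few-shot prompt from (context, anaphor, antecedent) demos."""
    parts = []
    for ex_context, ex_anaphor, ex_antecedent in examples:
        parts.append(
            f"Text: {ex_context}\n"
            f"Pronoun: {ex_anaphor}\n"
            f"Antecedent: {ex_antecedent}\n"
        )
    # The query instance ends with an open slot for the model to complete.
    parts.append(f"Text: {context}\nPronoun: {anaphor}\nAntecedent:")
    return "\n".join(parts)

demo = [("Petr koupil auto. Bylo drahé.", "Bylo", "auto")]
prompt = build_prompt("Jana potkala Evu. Pozdravila ji.", "ji", demo)
print(prompt)
```

Each demonstration consumes a paragraph-length context, which is why the number of in-prompt examples stays small in practice.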


Towards Generating Automatic Anaphora Annotations

Taji, Dima, Zeman, Daniel

arXiv.org Artificial Intelligence

Training models that perform well on various NLP tasks requires large amounts of data, and this becomes more apparent with nuanced tasks such as anaphora and coreference resolution. To combat the prohibitive cost of creating manually gold-annotated data, this paper explores two methods for automatically creating datasets with coreferential annotations: direct conversion from existing datasets, and parsing with multilingual models capable of handling new and unseen languages. The paper details the current progress on both fronts, the challenges these efforts currently face, and our approach to overcoming them.



A Sequence-to-Sequence Approach for Arabic Pronoun Resolution

Murayshid, Hanan S., Benhidour, Hafida, Kerrache, Said

arXiv.org Artificial Intelligence

This paper proposes a sequence-to-sequence learning approach for Arabic pronoun resolution, which explores the effectiveness of using advanced natural language processing (NLP) techniques, specifically Bi-LSTM and the BERT pre-trained language model, in solving the pronoun resolution problem in Arabic. The proposed approach is evaluated on the AnATAr dataset, and its performance is compared to several baseline models, including traditional machine learning models and handcrafted feature-based models. Our results demonstrate that the proposed model outperforms the baseline models, which include KNN, logistic regression, and SVM, across all metrics. In addition, we explore the effectiveness of various modifications to the model, including concatenating the anaphor text alongside the paragraph text as input, adding a mask to focus on candidate scores, and filtering candidates based on gender and number agreement with the anaphor. Our results show that these modifications significantly improve the model's performance, achieving up to 81% MRR and a 71% F1 score, while also demonstrating higher precision, recall, and accuracy. These findings suggest that the proposed model is an effective approach to Arabic pronoun resolution and highlight the potential benefits of leveraging advanced NLP neural models.
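The agreement-based candidate filter mentioned in the abstract can be sketched as follows (a minimal illustration with hypothetical feature names; the AnATAr annotations differ in detail):

```python
# Keep only antecedent candidates whose gender and number agree with the
# anaphor; unknown (missing) values are treated as compatible.
# Feature names ("gender", "number") are illustrative.

def filter_candidates(candidates, anaphor):
    def agrees(a, b):
        return a is None or b is None or a == b
    return [
        c for c in candidates
        if agrees(c.get("gender"), anaphor.get("gender"))
        and agrees(c.get("number"), anaphor.get("number"))
    ]

anaphor = {"text": "hiya", "gender": "f", "number": "sg"}
candidates = [
    {"text": "al-bint", "gender": "f", "number": "sg"},
    {"text": "al-walad", "gender": "m", "number": "sg"},
    {"text": "al-banat", "gender": "f", "number": "pl"},
]
print([c["text"] for c in filter_candidates(candidates, anaphor)])
# → ['al-bint']
```

Pruning morphologically incompatible candidates shrinks the search space before the neural scorer is applied, which is why the abstract reports it as a significant modification.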


Anaphora Resolution in Dialogue: System Description (CODI-CRAC 2022 Shared Task)

Anikina, Tatiana, Skachkova, Natalia, Renner, Joseph, Trivedi, Priyansh

arXiv.org Artificial Intelligence

We describe three models submitted for the CODI-CRAC 2022 shared task. To perform identity anaphora resolution, we test several combinations of the incremental clustering approach based on the Workspace Coreference System (WCS) with other coreference models. The best result is achieved by adding the "cluster merging" version of the coref-hoi model, which brings up to 10.33% improvement over vanilla WCS clustering. Discourse deixis resolution is implemented as multi-task learning: we combine the learning objective of coref-hoi with anaphor type classification. We adapt the higher-order resolution model introduced in Joshi et al. (2019) for bridging resolution given gold mentions and anaphors.
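The incremental clustering idea behind WCS can be sketched in a few lines (a toy version with a hypothetical similarity function and threshold; the actual system scores mentions with learned models):

```python
# Incremental coreference clustering: each mention joins the best-matching
# existing cluster, or starts a new one if no score clears the threshold.

def incremental_cluster(mentions, score, threshold=0.5):
    clusters = []
    for m in mentions:
        best, best_score = None, threshold
        for c in clusters:
            s = max(score(m, other) for other in c)
            if s > best_score:
                best, best_score = c, s
        if best is not None:
            best.append(m)
        else:
            clusters.append([m])
    return clusters

# Toy similarity: exact match after lowercasing.
sim = lambda a, b: 1.0 if a.lower() == b.lower() else 0.0
print(incremental_cluster(["She", "Bob", "she"], sim))
# → [['She', 'she'], ['Bob']]
```

A "cluster merging" step, as in the coref-hoi variant named above, would additionally compare whole clusters after this pass and fuse those that score highly against each other.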


Few-Shot Anaphora Resolution in Scientific Protocols via Mixtures of In-Context Experts

Le, Nghia T., Bai, Fan, Ritter, Alan

arXiv.org Artificial Intelligence

Anaphora resolution is an important task for information extraction across a range of languages, text genres, and domains, motivating the need for methods that do not require large annotated datasets. In-context learning has emerged as a promising approach, yet there are a number of challenges in applying in-context learning to resolve anaphora. For example, encoding a single in-context demonstration consisting of an anaphor, a paragraph-length context, and a list of corresponding antecedents requires conditioning a language model on a long sequence of tokens, limiting the number of demonstrations per prompt. In this paper, we present MICE (Mixtures of In-Context Experts), which we demonstrate is effective for few-shot anaphora resolution in scientific protocols (Tamari et al., 2021). Given only a handful of training examples, MICE combines the predictions of hundreds of in-context experts, yielding a 30% increase in F1 score over a competitive prompt retrieval baseline. Furthermore, we show MICE can be used to train compact student models without sacrificing performance. As far as we are aware, this is the first work to present experimental results demonstrating the effectiveness of in-context learning on the task of few-shot anaphora resolution in scientific protocols.
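The expert-combination step can be illustrated schematically (a toy aggregation with made-up scores; MICE's actual gating and weighting are more involved):

```python
# Combine many in-context experts: each expert, built from a different
# demonstration, returns a (predicted_antecedent, confidence) pair.
# Confidences are summed per candidate and the arg-max is returned.
from collections import defaultdict

def combine_experts(expert_outputs):
    totals = defaultdict(float)
    for antecedent, confidence in expert_outputs:
        totals[antecedent] += confidence
    return max(totals, key=totals.get)

outputs = [("the buffer", 0.6), ("the sample", 0.9), ("the buffer", 0.7)]
print(combine_experts(outputs))  # → the buffer (total 1.3 beats 0.9)
```

Because each expert conditions on only one demonstration, the long-prompt limitation described above is sidestepped: many short prompts are aggregated instead of one long one.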


Anaphora Resolution in Dialogue Systems for South Asian Languages

Annam, Vinay, Koditala, Nikhil, Mamidi, Radhika

arXiv.org Artificial Intelligence

Anaphora resolution is a challenging task that has long been of interest to NLP researchers. Traditional resolution techniques such as eliminative constraints and weighted preferences have been successful in many languages. However, they are ineffective in free word order languages, such as most South Asian languages. Heuristic and rule-based techniques, which are constrained to context and domain, have been typical in these languages. In this paper, we venture a new strategy using neural networks for resolving anaphora in human-human dialogues. The architecture chiefly consists of three components: a shallow parser for extracting features, a feature vector generator which produces the word embeddings, and a neural network model which predicts the antecedent mention of an anaphor. The system has been trained and tested on a Telugu conversation corpus we generated. Given the advantage of the semantic information in word embeddings, and by appending actor, gender, number, person, and part-of-plural features, the model has reached an F1 score of 86%.
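Appending categorical agreement features to a word embedding, as the abstract describes, can be sketched like this (a toy illustration with hypothetical dimensions and feature inventories):

```python
# Concatenate a word embedding with one-hot gender and number features
# to form the input vector for the antecedent-scoring network.
# The embedding size and feature inventories here are illustrative.

GENDERS = ["m", "f", "n"]
NUMBERS = ["sg", "pl"]

def one_hot(value, vocab):
    return [1.0 if value == v else 0.0 for v in vocab]

def feature_vector(embedding, gender, number):
    return embedding + one_hot(gender, GENDERS) + one_hot(number, NUMBERS)

vec = feature_vector([0.1, 0.2, 0.3], gender="f", number="sg")
print(len(vec))  # → 8 (3 embedding dims + 3 gender + 2 number)
```

The same pattern extends to the actor, person, and part-of-plural features mentioned above: each adds a small one-hot segment to the input vector.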


Analysing Neural Language Models: Contextual Decomposition Reveals Default Reasoning in Number and Gender Assignment

Jumelet, Jaap, Zuidema, Willem, Hupkes, Dieuwke

arXiv.org Artificial Intelligence

Extensive research has recently shown that recurrent neural language models are able to process a wide range of grammatical phenomena. How these models are able to perform these remarkable feats so well, however, is still an open question. To gain more insight into what information LSTMs base their decisions on, we propose a generalisation of Contextual Decomposition (GCD). In particular, this setup enables us to accurately distil which part of a prediction stems from semantic heuristics, which part truly emanates from syntactic cues, and which part arises from the model's own biases. We investigate this technique on tasks pertaining to syntactic agreement and coreference resolution and discover that the model strongly relies on a default reasoning effect to perform these tasks. Modern language models that use deep learning architectures such as LSTMs, bi-LSTMs, and Transformers have shown enormous gains in performance in the last few years and are finding applications in novel domains, ranging from speech recognition and writing assistance to autonomous generation of fake news. Understanding how they reach their predictions has become a key question for NLP, not only for purely scientific but also for practical and ethical reasons. From a linguistic perspective, a natural approach is to test the extent to which these models have learned classical linguistic constructs, such as inflectional morphology, constituency structure, agreement between verb and subject, filler-gap dependencies, negative polarity, or reflexive anaphora. An influential paper using this approach was presented by Linzen et al. (2016), who investigated the performance of an LSTM-based language model on number agreement.


Towards Compositional Distributional Discourse Analysis

Coecke, Bob, de Felice, Giovanni, Marsden, Dan, Toumi, Alexis

arXiv.org Artificial Intelligence

In the last couple of decades, the traditional symbolic approach to AI and cognitive science -- which aims at characterising human intelligence in terms of abstract logical processes -- has been challenged by so-called connectionist AI: the study of the human brain as a complex network of basic processing units [18]. When it comes to human language, the same divide manifests itself as the opposition between two principles, which in turn induce two distinct approaches to Natural Language Processing (NLP). On one hand, Frege's principle of compositionality asserts that the meaning of a complex expression is a function of its sub-expressions and the way in which they are composed -- distributionality, on the other hand, can be summed up in Firth's maxim "You shall know a word by the company it keeps". Once implemented in terms of concrete algorithms, we have expert systems driven by formal logical rules on one end, and artificial neural networks and machine learning on the other. Categorical Compositional Distributional (DisCoCat) models, first introduced in [4], aim at getting the best of both worlds: the string diagram notation borrowed from category theory allows one to manipulate grammatical reductions as linear maps and to compute graphically the semantics of a sentence as the composition of the vectors obtained from the distributional semantics of its constituent words. In this paper, we introduce basic anaphoric discourses as mid-level representations between natural language discourse on one end -- formalised in terms of basic discourse representation structures (DRS) [2] -- and knowledge queries over the Semantic Web on the other -- given by basic graph patterns in the Resource Description Framework (RDF) [19]. We construct discourses as formal diagrams of real-valued matrices, and we then use these diagrams to give abstract reformulations of NLP problems: probabilistic anaphora resolution and question answering.


Improve Tree Kernel-Based Event Pronoun Resolution with Competitive Information

Kong, Fang (Soochow University) | Zhou, Guodong (Soochow University)

AAAI Conferences

Event anaphora resolution plays a critical role in discourse analysis. This paper proposes a tree kernel-based framework for event pronoun resolution. In particular, a new tree expansion scheme is introduced to automatically determine a proper parse tree structure for event pronoun resolution by considering various kinds of competitive information related to the anaphor and the antecedent candidate. Evaluation on the OntoNotes English corpus shows the appropriateness of the tree kernel-based framework and the effectiveness of competitive information for event pronoun resolution.


Translation of Pronominal Anaphora between English and Spanish: Discrepancies and Evaluation

Ferrandez, A., Peral, J.

arXiv.org Artificial Intelligence

This paper evaluates the different tasks carried out in the translation of pronominal anaphora in a machine translation (MT) system. The MT interlingua approach named AGIR (Anaphora Generation with an Interlingua Representation) improves upon other proposals presented to date because it is able to translate intersentential anaphors, detect co-reference chains, and translate Spanish zero pronouns into English---issues hardly considered by other systems. The paper presents the resolution and evaluation of these anaphora problems in AGIR with the use of different kinds of knowledge (lexical, morphological, syntactic, and semantic). The translation of English and Spanish anaphoric third-person personal pronouns (including Spanish zero pronouns) into the target language has been evaluated on unrestricted corpora. We have obtained a precision of 80.4% and 84.8% in the translation of Spanish and English pronouns, respectively. Although we have only studied the Spanish and English languages, our approach can be easily extended to other languages such as Portuguese, Italian, or Japanese.